natural language query
Drill-down: Interactive Retrieval of Complex Scenes using Natural Language Queries
This paper explores the task of interactive image retrieval using natural language queries, where a user progressively provides input queries to refine a set of retrieval results. Moreover, our work explores this problem in the context of complex image scenes containing multiple objects. We propose Drill-down, an effective framework for encoding multiple queries with an efficient compact state representation that significantly extends current methods for single-round image retrieval. We show that using multiple rounds of natural language queries as input can be surprisingly effective for finding arbitrarily specific images of complex scenes. Furthermore, we find that existing image datasets with textual captions can provide a surprisingly effective form of weak supervision for this task. We compare our method with existing sequential encoding and embedding networks, demonstrating superior performance on two proposed benchmarks: automatic image retrieval on a simulated scenario that uses region captions as queries, and interactive image retrieval using real queries from human evaluators.
Detecting Moments and Highlights in Videos via Natural Language Queries
Detecting customized moments and highlights from videos given natural language (NL) user queries is an important but under-studied topic. One of the challenges in pursuing this direction is the lack of annotated data. To address this issue, we present the Query-based Video Highlights (QVHighlights) dataset. It consists of over 10,000 YouTube videos, covering a wide range of topics, from everyday activities and travel in lifestyle vlog videos to social and political activities in news videos. Each video in the dataset is annotated with: (1) a human-written free-form NL query, (2) relevant moments in the video w.r.t. the query, and (3) five-point scale saliency scores for all query-relevant clips. This comprehensive annotation enables us to develop and evaluate systems that detect relevant moments as well as salient highlights for diverse, flexible user queries. We also present a strong baseline for this task, Moment-DETR, a transformer encoder-decoder model that views moment retrieval as a direct set prediction problem, taking extracted video and query representations as inputs and predicting moment coordinates and saliency scores end-to-end. While our model does not utilize any human prior, we show that it performs competitively when compared to well-engineered architectures. With weakly supervised pretraining using ASR captions, Moment-DETR substantially outperforms previous methods.
- Information Technology > Databases (0.82)
- Information Technology > Artificial Intelligence > Natural Language (0.62)
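The QVHighlights task scores predicted moments against annotated ones by temporal overlap. Moment-DETR learns this end-to-end with a set-prediction (Hungarian matching) loss; the sketch below only illustrates the simpler underlying notion of matching predicted (start, end) spans to ground truth by 1D IoU, with a greedy pairing standing in for the learned matching.

```python
# Illustrative sketch: pairing predicted moments with ground-truth moments by
# temporal IoU. Moment-DETR itself uses a learned Hungarian set-matching loss;
# greedy matching here is a simplification for illustration.

def temporal_iou(a, b):
    """IoU of two (start, end) spans in seconds."""
    inter = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - inter
    return inter / union if union > 0 else 0.0

def greedy_match(predictions, ground_truth, iou_threshold=0.5):
    """Greedily pair each prediction with the best unmatched ground-truth span."""
    matched, used = [], set()
    for p in predictions:
        best, best_iou = None, iou_threshold
        for i, g in enumerate(ground_truth):
            iou = temporal_iou(p, g)
            if i not in used and iou >= best_iou:
                best, best_iou = i, iou
        if best is not None:
            used.add(best)
            matched.append((p, ground_truth[best], best_iou))
    return matched
```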
GDC Cohort Copilot: An AI Copilot for Curating Cohorts from the Genomic Data Commons
Song, Steven, Subramanyam, Anirudh, Zhang, Zhenyu, Venkat, Aarti, Grossman, Robert L.
The Genomic Data Commons (GDC) provides access to high quality, harmonized cancer genomics data through a unified curation and analysis platform centered around patient cohorts. While GDC users can interactively create complex cohorts through the graphical Cohort Builder, users (especially new ones) may struggle to find specific cohort descriptors across hundreds of possible fields and properties. However, users may be better able to describe their desired cohort in free-text natural language. We introduce GDC Cohort Copilot, an open-source copilot tool for curating cohorts from the GDC. GDC Cohort Copilot automatically generates the GDC cohort filter corresponding to a user-input natural language description of their desired cohort, before exporting the cohort back to the GDC for further analysis. An interactive user interface allows users to further refine the generated cohort. We develop and evaluate multiple large language models (LLMs) for GDC Cohort Copilot and demonstrate that our locally-served, open-source GDC Cohort LLM achieves better results than GPT-4o prompting in generating GDC cohorts. We implement and share GDC Cohort Copilot as a containerized Gradio app on HuggingFace Spaces, available at https://huggingface.co/spaces/uc-ctds/GDC-Cohort-Copilot. GDC Cohort LLM weights are available at https://huggingface.co/uc-ctds. All source code is available at https://github.com/uc-cdis/gdc-cohort-copilot.
- Health & Medicine > Therapeutic Area > Oncology (1.00)
- Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
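The core step in GDC Cohort Copilot is turning a free-text cohort description into a structured cohort filter. A minimal sketch of that step, with the LLM replaced by a keyword lookup: the nested `{"op": ..., "content": ...}` shape loosely follows GDC-style filter JSON, but the exact fields and values the real GDC Cohort LLM emits are an assumption here.

```python
# Minimal sketch of the NL -> cohort-filter step, with the LLM stubbed out as a
# keyword table. FIELD_RULES and the field names are hypothetical illustrations,
# not the actual GDC schema.

FIELD_RULES = {
    "lung": ("cases.primary_site", "bronchus and lung"),
    "female": ("cases.demographic.gender", "female"),
}

def nl_to_cohort_filter(description: str) -> dict:
    """Map a natural-language cohort description to a nested filter dict."""
    clauses = [
        {"op": "in", "content": {"field": field, "value": [value]}}
        for keyword, (field, value) in FIELD_RULES.items()
        if keyword in description.lower()
    ]
    return {"op": "and", "content": clauses}
```

In the real system the generated filter is rendered in an interactive UI so the user can refine it before exporting the cohort back to the GDC.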
LAFA: Agentic LLM-Driven Federated Analytics over Decentralized Data Sources
Ji, Haichao, Wang, Zibo, Pan, Cheng, Han, Meng, Zhu, Yifei, Wang, Dan, Han, Zhu
Large Language Models (LLMs) have shown great promise in automating data analytics tasks by interpreting natural language queries and generating multi-operation execution plans. However, existing LLM-agent-based analytics frameworks operate under the assumption of centralized data access, offering little to no privacy protection. In contrast, federated analytics (FA) enables privacy-preserving computation across distributed data sources, but lacks support for natural language input and requires structured, machine-readable queries. In this work, we present LAFA, the first system that integrates LLM-agent-based data analytics with FA. LAFA introduces a hierarchical multi-agent architecture that accepts natural language queries and transforms them into optimized, executable FA workflows. To improve execution efficiency, an optimizer agent rewrites and merges multiple DAGs, eliminating redundant operations and minimizing computational and communication overhead. Our experiments demonstrate that LAFA consistently outperforms baseline prompting strategies by achieving higher execution plan success rates and reducing resource-intensive FA operations by a substantial margin. This work establishes a practical foundation for privacy-preserving, LLM-driven analytics that supports natural language input in the FA setting. The rapid development of Large Language Models (LLMs) has offered unprecedented capabilities in natural language understanding, reasoning, and planning [1], significantly transforming the landscape of data analytics. LLMs can interpret complex analytical intents, generate structured code, and orchestrate multi-step tasks by interacting with external environments such as databases and computation sandboxes. These capabilities have led to the emergence of LLM-based agents that decompose high-level queries, plan analytical workflows, and execute or verify results through tool interactions.
- North America > United States > California (0.14)
- Europe > Austria > Vienna (0.14)
- North America > United States > Texas > Harris County > Houston (0.14)
- (14 more...)
- Workflow (0.97)
- Research Report (0.82)
- Law (1.00)
- Information Technology > Security & Privacy (1.00)
- Government (1.00)
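The optimizer agent described above rewrites and merges multiple DAGs to eliminate redundant operations. A minimal sketch of that deduplication idea, under the assumption that operations can be modeled as `(node_id, op_name, input_ids)` nodes; the real LAFA plans carry far richer operation metadata.

```python
# Sketch of DAG merging: structurally identical federated operations across
# several plans are collapsed into a single node, so each expensive FA
# operation runs once. Node/operation names are illustrative.

def merge_dags(dags):
    """Merge DAGs given as lists of (node_id, op, input_ids), deduplicating
    nodes whose op and (already-merged) inputs are identical."""
    canonical = {}   # (op, merged input ids) -> merged node id
    remap = {}       # original node id -> merged node id
    merged = []
    for dag in dags:
        for node_id, op, inputs in dag:
            key = (op, tuple(remap[i] for i in inputs))
            if key not in canonical:
                new_id = len(merged)
                canonical[key] = new_id
                merged.append((new_id, op, key[1]))
            remap[node_id] = canonical[key]
    return merged
```

Two workflows that both load the same source and compute the same secure sum before diverging would share those first two nodes after merging.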
GQVis: A Dataset of Genomics Data Questions and Visualizations for Generative AI
Walters, Skylar Sargent, Valderrama, Arthea, Smits, Thomas C., Kouřil, David, Nguyen, Huyen N., L'Yi, Sehi, Lange, Devin, Gehlenborg, Nils
Data visualization is a fundamental tool in genomics research, enabling the exploration, interpretation, and communication of complex genomic features. While machine learning models show promise for transforming data into insightful visualizations, current models lack the training foundation for domain-specific tasks. In an effort to provide a foundational resource for genomics-focused model training, we present a framework for generating a dataset that pairs abstract, low-level questions about genomics data with corresponding visualizations. Building on prior work with statistical plots, our approach adapts to the complexity of genomics data and the specialized representations used to depict them. We further incorporate multiple linked queries and visualizations, along with justifications for design choices, figure captions, and image alt-texts for each item in the dataset. We use genomics data retrieved from three distinct genomics data repositories (4DN, ENCODE, Chromoscope) to produce GQVis: a dataset consisting of 1.14 million single-query data points, 628k query pairs, and 589k query chains. The GQVis dataset and generation code are available at https://huggingface.co/datasets/HIDIVE/GQVis and https://github.com/hms-dbmi/GQVis-Generation.
- North America > United States > Massachusetts > Middlesex County > Lowell (0.04)
- Asia > Middle East > Jordan (0.04)
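Each GQVis item pairs a question with a visualization plus supporting text (justification, caption, alt-text), and items can be linked into pairs and chains. A sketch of what one such record might hold; the field names here are illustrative, not the dataset's actual schema.

```python
# Hypothetical record structure for one GQVis-style data point. Field names
# are assumptions for illustration only.

from dataclasses import dataclass, field

@dataclass
class GenomicsVisRecord:
    question: str                 # abstract, low-level question about the data
    vis_spec: dict                # declarative visualization specification
    justification: str            # why this visual encoding fits the question
    caption: str                  # figure caption
    alt_text: str                 # image alt-text for accessibility
    linked_queries: list = field(default_factory=list)  # follow-ups forming a chain

record = GenomicsVisRecord(
    question="How does read coverage vary across the first megabase of chr1?",
    vis_spec={"mark": "bar", "x": "position", "y": "coverage"},
    justification="A bar track shows per-bin coverage along a genomic axis.",
    caption="Read coverage across the first megabase of chromosome 1.",
    alt_text="Bar chart of read coverage along chr1.",
)
```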
Agentic NL2SQL to Reduce Computational Costs
Jehle, Dominik, Purucker, Lennart, Hutter, Frank
Translating natural language queries into SQL queries (NL2SQL or Text-to-SQL) has recently been empowered by large language models (LLMs). Using LLMs to perform NL2SQL on a large collection of SQL databases necessitates processing large quantities of meta-information about the databases, which in turn results in lengthy prompts with many tokens and high processing costs. To address this challenge, we introduce Datalake Agent, an agentic system designed to enable an LLM to solve NL2SQL tasks more efficiently. Instead of utilizing direct solvers for NL2SQL that call the LLM once with all meta-information in the prompt, the Datalake Agent employs an interactive loop to reduce the utilized meta-information. Within the loop, the LLM is used in a reasoning framework that selectively requests only the necessary information to solve a table question answering task. We evaluate the Datalake Agent on a collection of 23 databases with 100 table question answering tasks. The Datalake Agent reduces the tokens used by the LLM by up to 87% and thus allows for substantial cost reductions while maintaining competitive performance.
- Europe > Germany > Baden-Württemberg > Freiburg (0.06)
- North America > United States (0.04)
- Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.04)
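The interactive loop above can be sketched as follows: instead of packing every database schema into one prompt, the model requests only the metadata it needs before emitting SQL. The LLM is stubbed here as a simple rule, and the table and schema names are hypothetical.

```python
# Sketch of the Datalake Agent's interactive loop with the LLM replaced by a
# stub. Only the schemas the "model" asks for are ever sent, which is the
# source of the token savings.

def stub_llm(question, known_schemas):
    """Stand-in for the LLM: request a missing schema, else answer with SQL."""
    table = "sales" if "revenue" in question else "users"
    if table not in known_schemas:
        return {"action": "request_schema", "table": table}
    return {"action": "answer", "sql": f"SELECT SUM(amount) FROM {table}"}

def datalake_loop(question, catalog, max_turns=5):
    known_schemas, tokens_sent = {}, 0
    for _ in range(max_turns):
        step = stub_llm(question, known_schemas)
        if step["action"] == "request_schema":
            schema = catalog[step["table"]]
            known_schemas[step["table"]] = schema
            tokens_sent += len(schema.split())  # crude token count
        else:
            return step["sql"], tokens_sent
    raise RuntimeError("no answer within turn budget")
```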
Agentic generative AI for media content discovery at the National Football League
Wang, Henry, Salekin, Md Sirajus, Lee, Jake, Claytor, Ross, Zhang, Shinan, Chi, Michael
Generative AI has unlocked new possibilities in content discovery and management. Through collaboration with the National Football League (NFL), we demonstrate how a generative-AI based workflow allows media researchers and analysts to query relevant historical plays using natural language, rather than using traditional filter and click-based interfaces. The agentic workflow takes a user query in natural language as an input, dissects the query into different elements, and then translates these elements into the underlying database query language. The accuracy and latency of retrieval are further improved through carefully designed semantic caching. The solution performs with over 95-percent accuracy and reduces the average time of finding relevant videos from 10 minutes to 30 seconds, significantly increasing the NFL's operational efficiency and allowing users to focus more on producing creative content and engaging storylines.
- Workflow (0.57)
- Press Release (0.42)
- Research Report (0.40)
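The semantic caching mentioned above can be sketched as an embedding-similarity lookup: if a new natural-language query is close enough to a previously answered one, the cached database query is reused instead of re-running the full agentic workflow. The toy bag-of-words embedding below stands in for a real sentence encoder, and the threshold is an assumed tuning knob.

```python
# Sketch of a semantic cache for NL -> database-query translation. A bag-of-
# words cosine similarity is a stand-in for a real sentence-embedding model.

import math

def embed(text):
    words = text.lower().split()
    return {w: words.count(w) for w in set(words)}

def cosine(a, b):
    dot = sum(a[w] * b.get(w, 0) for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

class SemanticCache:
    def __init__(self, threshold=0.8):
        self.entries, self.threshold = [], threshold

    def get(self, query):
        vec = embed(query)
        for cached_vec, result in self.entries:
            if cosine(vec, cached_vec) >= self.threshold:
                return result
        return None  # cache miss: run the full agentic workflow instead

    def put(self, query, result):
        self.entries.append((embed(query), result))
```

A near-duplicate query then hits the cache, which is where the reported latency reduction from minutes to seconds largely comes from.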
CoDA: Agentic Systems for Collaborative Data Visualization
Chen, Zichen, Chen, Jiefeng, Arik, Sercan Ö., Sra, Misha, Pfister, Tomas, Yoon, Jinsung
Deep research has revolutionized data analysis, yet data scientists still devote substantial time to manually crafting visualizations, highlighting the need for robust automation from natural language queries. However, current systems struggle with complex datasets containing multiple files and iterative refinement. Existing approaches, including simple single- or multi-agent systems, often oversimplify the task, focusing on initial query parsing while failing to robustly manage data complexity, code errors, or final visualization quality. In this paper, we reframe this challenge as a collaborative multi-agent problem. We introduce CoDA, a multi-agent system that employs specialized LLM agents for metadata analysis, task planning, code generation, and self-reflection. We formalize this pipeline, demonstrating how metadata-focused analysis bypasses token limits and quality-driven refinement ensures robustness. Extensive evaluations show CoDA achieves substantial gains in the overall score, outperforming competitive baselines by up to 41.5%. This work demonstrates that the future of visualization automation lies not in isolated code generation but in integrated, collaborative agentic workflows.
- North America > United States > New York > New York County > New York City (0.04)
- North America > United States > Washington > King County > Seattle (0.04)
- North America > United States > California > Santa Barbara County > Santa Barbara (0.04)
- Asia > Middle East > Jordan (0.04)
- Workflow (0.68)
- Research Report (0.64)
- Overview (0.46)
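The CoDA pipeline of specialized agents can be sketched as sequential stages with a self-reflection gate that sends flawed output back for another attempt. Each stage is stubbed as a plain function here; in the paper they are LLM agents, and the metadata stage is what keeps prompts within token limits by summarizing data rather than passing it raw.

```python
# Sketch of a CoDA-style collaborative pipeline: metadata analysis -> planning
# -> code generation -> self-reflection. All stages are illustrative stubs.

def analyze_metadata(dataset):
    # Summarize columns instead of passing raw rows, keeping prompts small.
    return {"columns": sorted({k for row in dataset for k in row})}

def plan_task(meta, query):
    return {"chart": "bar", "x": meta["columns"][0], "goal": query}

def generate_code(plan):
    return f"plot_{plan['chart']}(x={plan['x']!r})"

def reflect(code):
    # Quality gate: here we only check the output is well-formed plotting code.
    return code.startswith("plot_")

def coda_pipeline(dataset, query, max_retries=2):
    meta = analyze_metadata(dataset)
    for _ in range(max_retries + 1):
        code = generate_code(plan_task(meta, query))
        if reflect(code):
            return code
    raise RuntimeError("self-reflection rejected all attempts")
```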
Beamforming-LLM: What, Where and When Did I Miss?
We present Beamforming-LLM, a system that enables users to semantically recall conversations they may have missed in multi-speaker environments. The system combines spatial audio capture using a microphone array with retrieval-augmented generation (RAG) to support natural language queries such as, "What did I miss when I was following the conversation on dogs?" Directional audio streams are separated using beamforming, transcribed with Whisper, and embedded into a vector database using sentence encoders. Upon receiving a user query, semantically relevant segments are retrieved, temporally aligned with non-attended segments, and summarized using a lightweight large language model (GPT-4o-mini). The result is a user-friendly interface that provides contrastive summaries, spatial context, and timestamped audio playback. This work lays the foundation for intelligent auditory memory systems and has broad applications in assistive technology, meeting summarization, and context-aware personal spatial computing.
- North America > United States > New York > New York County > New York City (0.05)
- Asia > Middle East > Israel (0.04)
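The retrieval-and-alignment step described above can be sketched as follows: transcribed, timestamped segments from each beamformed direction are ranked against the user query, restricted to the window the user was not attending. Simple word overlap stands in for the sentence-encoder similarity the system uses, and the segment format is an assumption.

```python
# Sketch of recalling missed conversation segments. segments are hypothetical
# (start_s, end_s, direction, text) tuples produced by beamforming + Whisper.

def overlap_score(query, text):
    q, t = set(query.lower().split()), set(text.lower().split())
    return len(q & t) / len(q) if q else 0.0

def recall_missed(segments, query, missed_window, top_k=2):
    """Rank segments overlapping the non-attended window by query similarity."""
    lo, hi = missed_window
    candidates = [s for s in segments if s[0] < hi and s[1] > lo]
    ranked = sorted(candidates, key=lambda s: overlap_score(query, s[3]),
                    reverse=True)
    return ranked[:top_k]
```

The retrieved segments would then be summarized by the lightweight LLM and presented with spatial context and timestamped playback.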